Polygons

ImportantSummary

This tutorial explore how to handle spatial polygons in R with sf package:

  • read a spatial object with (sf::st_read())
  • calculate area of polygons with (sf::st_area())
  • get the xx_xx (sf::st_distance())
TipThe ecologist mind

In which type of habitats were the otter observed? To answer this question, we will need to discover another type of vector: polygons.

Setup

Follow the setup instructions if you haven’t followed the tutorial on points

If haven’t done it already, please follow the setup instructions. Let’s start with loading the required packages.

suppressPackageStartupMessages({
  library(mapview)
  library(here)
  library(sf)
  library(terra)
})
pt_otter <- st_read(here("data", "gbif_otter_2021_mpl50km.gpkg"))
Reading layer `gbif_otter_2021_mpl50km' from data source 
  `/home/romain/GitHub/spatial-r/data/gbif_otter_2021_mpl50km.gpkg' 
  using driver `GPKG'
Simple feature collection with 69 features and 13 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 3.28983 ymin: 43.30373 xmax: 4.46447 ymax: 44.06548
Geodetic CRS:  WGS 84
pt_otter <- st_read(
  "https://github.com/FRBCesab/spatial-r/raw/main/data/gbif_otter_2021_mpl50km.gpkg"
)

Load polygons from a shapefile

In practice, most polygons come from existing spatial datasets (apart of grid addref). In this example, we will load land use land cover information for the area of interest from IGN data BD CARTO.

Note that this dataset has rough resolution (OSO or Corine land cover would be more suited for real analysis), but it’s perfect for our illustration and learning purposes.

You can load all vector dataset in R with the function sf::st_read().

landuse <- st_read(here("data", "BDCARTO-LULC_mpl50km.shp"))
Reading layer `BDCARTO-LULC_mpl50km' from data source 
  `/home/romain/GitHub/spatial-r/data/BDCARTO-LULC_mpl50km.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 2384 features and 3 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 3.248306 ymin: 43.15835 xmax: 4.495429 ymax: 44.06731
Geodetic CRS:  WGS 84
landuse <- st_read(
  "https://github.com/FRBCesab/spatial-r/raw/refs/heads/main/data/BDCARTO-LULC_mpl50km.shp"
)
NoteYour turn
  1. How many different polygons make the land cover of the area?
  2. What is the coordinate reference system (CRS) of the loaded river data?
  3. How many land cover classes are there? which class has most polygons?
Click to see the answer
  1. There was 2384 polygons in the dataset. You can access it with dim(landuse), nrow(landuse) or just by typing landuse in the console.
  2. The coordinates are in WGS 84 (EPSG 4326). You can access this information with st_crs(landuse).
  3. The column that stored the name of river stretches is nature. You could identify it with head(landuse) or names(landuse). There are 12 land cover classes, Prairie is the class with most polygons (table(landuse$nature)).

Visualization

mapview(landuse, z = "nature") +
  mapview(pt_otter, col.regions = "black")
plot(landuse["nature"], reset = FALSE, border = FALSE, axes = TRUE)
plot(st_geometry(pt_otter), add = TRUE, pch = 16)

Calculate the area

The function expanse calculate the area in \(m^2\). Again, be careful with projection systems (addrefslide). Some are not suited to calculate areas. Prefer equalarea projections or use local projection system (if your study area is small).

The package terra recommends the calculation of areas in lat/long to get more accurate results (accounting for Earth’s curvature). The package sf handle lat/long with s2 which is not very accurate or prone to errors.

So it is recommended to use terra when dealing with geographic coordinates, or projecting the coordinates in an appropriate projection system.

# transform land use as SpatVector
landuse_terra <- vect(landuse)
# calculate the area in ha
area_polygons <- expanse(landuse_terra) * 0.0001
# store the area as atribute
landuse_terra$area_ha <- as.numeric(area_polygons)
# 2154 is a commun projection for France
landuse_2154 <- st_transform(landuse, crs = 2154)
# calculate area of polygons
area_polygons <- st_area(landuse_2154)
# transform in ha
units(area_polygons) <- "ha"
# same as st_area(landuse)) * 0.0001
# store as attribute in landuse
landuse_2154$area_ha <- as.numeric(area_polygons)
NoteYour turn
  • Which is the largest land cover class in our study area?
  • Compare the area calculation of terra, sf.
Click to see the answer
# see area per land use classes
tapply(landuse_terra$area_ha, landuse$nature, sum) #/sum(landuse_terra$area_ha) * 100
              Bâti       Broussailles Carrière, décharge          Eau libre 
       47485.21879        89142.56281         1624.44372       279914.24086 
             Forêt      Marais salant  Marais, tourbière            Prairie 
      180470.59831         3386.96654        23674.11376       160747.53153 
   Rocher, éboulis     Sable, gravier      Vigne, verger   Zone d'activités 
          37.78454         1875.32616       175294.25395        13725.54131 
tapply(landuse_2154$area_ha, landuse$nature, sum) #/sum(landuse_2154$area_ha) * 100
              Bâti       Broussailles Carrière, décharge          Eau libre 
        47509.9590         89177.4533          1625.3989        280220.8782 
             Forêt      Marais salant  Marais, tourbière            Prairie 
       180509.0654          3389.7172         23692.1565        160809.4575 
   Rocher, éboulis     Sable, gravier      Vigne, verger   Zone d'activités 
           37.7996          1876.9749        175391.1066         13734.7157 

Points to polygons

Before making extraction, it is recommended to plot the data (if not too big), to make sure the projection systems are the same and the extents match. Do not use mapview (interactive map) because it will automatically project the data.

Extract the land cover class of the points with sf::st_join():

pt_landcover <- st_join(pt_otter, landuse)

Extract the land cover class of the points with terra::extract():

# transform land use as SpatVector
pt_otter_terra <- vect(pt_otter)
# overlay points and polygon with extract()
pt_landcover_terra <- extract(landuse_terra, pt_otter_terra)
NoteYour turn
  • Which is the most commun land cover classes where otter were observed?
  • [stat] Are there classes that are over or under represented compare to expected?
Click to see the answer
# see area per land use classes of the observations
sort(table(pt_landcover$nature), decreasing = TRUE)[1:5]

        Forêt  Broussailles       Prairie Vigne, verger          Bâti 
           24            14            14             8             7 

Polygons to polygons

In many cases, we want to characterize the area surrounding the observations. So we will create buffers.

Create buffer

Same here, sf is really not good with geographic coordinates (e.g. EPSG:4326) so it’s better to project the points before creating the buffer with sf. with sf::st_buffer().

dist_buffer <- 1000 # buffer of 1000 m
pt_otter_2154 <- st_transform(pt_otter, 2154)
poly_otter_2154 <- st_buffer(pt_otter_2154, dist_buffer)

with terra::buffer().

dist_buffer <- 1000 # buffer of 1000 m
poly_otter_terra <- buffer(pt_otter_terra, dist_buffer)

Visualize buffer

plot(st_geometry(poly_otter_2154)[1], reset = FALSE)
plot(st_geometry(pt_otter_2154)[1], add = TRUE)

plot(poly_otter_terra[1])
plot(pt_otter_terra[1], add = TRUE)

NoteYour turn
  • What is the area of the newly created buffers?
Click to see the answer
# see area per land use classes of the observations
summary(st_area(poly_otter_2154))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
3140157 3140157 3140157 3140157 3140157 3140157 
summary(expanse(poly_otter_terra))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
3128689 3128689 3128689 3128689 3128689 3128689 

The difference is due to the curvature of earth, in projected coordinates we have the planar area (which should theoretically be 3141593 m2), and in geographic coordinates we have geodesic area.

intersection

buffer_landcover <- st_intersection(poly_otter_2154, landuse_2154)
Warning: attribute variables are assumed to be spatially constant throughout
all geometries
# visualize the intersection
mapview(buffer_landcover, z = "nature")
buffer_landcover_terra <- intersect(landuse_terra, poly_otter_terra)

# visualize the intersection
mapview(buffer_landcover_terra, z = "nature")

covers

buffer_landcover$area_ha <- st_area(buffer_landcover) * 0.0001

occ <- tapply(
  buffer_landcover$area_ha,
  list(buffer_landcover$X, buffer_landcover$nature),
  sum
)
#replace NA by 0
occ[is.na(occ)] <- 0
# calculate the overlay area
sum_occ <- rowSums(occ)
# calculate the percentage per class
perc_occ <- occ / sum_occ * 100

# add information in the spatial vector
# pt_res <- cbind(pt_otter, perc_occ)
# mapview(pt_res, z = "Forêt", layer.name = "% foret")
buffer_landcover_terra$area_ha <- expanse(buffer_landcover_terra) * 0.0001

occ <- tapply(
  buffer_landcover_terra$area_ha,
  list(buffer_landcover_terra$X, buffer_landcover_terra$nature),
  sum
)
#replace NA by 0
occ[is.na(occ)] <- 0
# calculate the overlay area
sum_occ <- rowSums(occ)
# calculate the percentage per class
perc_occ <- occ / sum_occ * 100

barplot(t(perc_occ))

# add information in the spatial vector
pt_otter_terra <- cbind(pt_otter_terra, perc_occ)
mapview(pt_otter_terra, z = "Forêt", layer.name = "% foret")